Make shielded syncing a separate command #2422

Closed · wants to merge 7 commits

Conversation

@batconjurer (Member) commented Jan 22, 2024

Describe your changes

Previously, the scanning of MASP notes was done as part of various MASP commands. This moves that logic into a separate command so that it can be done out of band.

Indicate on which release or other PRs this topic is based on

#2363

Checklist before merging to draft

  • I have added a changelog
  • Git history is in acceptable state

@batconjurer mentioned this pull request Jan 23, 2024
@murisi (Contributor) left a comment:

Thanks for these changes, I think I understand the general flow. But maybe I am missing some of the higher level considerations that motivated this approach to separating MASP synchronization into a separate command. Some questions:

  • Have alternatives to interrupts been considered? Like continuously saving/appending new transactions to a file?
  • Given interrupt usage and SyncStatus modality, will the MASP functionality be easily usable from the SDK on all platforms (including the web)?
  • How do the ShieldedContext lock guard in the SDK and SyncStatus marker type interact? Is there any overlap or redundancy here?
  • How do we handle shielded context file consistency when multiple clients (from separate processes) are running?

@@ -1471,6 +1463,22 @@ pub struct IndexedTx {
pub index: TxIndex,
}

impl PartialOrd for IndexedTx {
Contributor:

What is the difference between this ordering of IndexedTx and the one from #[derive(Ord, PartialOrd)]? Isn't the latter also lexicographic? No?

Member Author:

I didn't trust the derived `Ord` (especially if this struct changes in the future), and I wanted to be very explicit to other devs since this ordering is important.

Member:

Let's leave a note about it (or a test).
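A hedged sketch of such a test. The field names and the integer-newtype constructors (`BlockHeight`, `TxIndex`) are assumptions based on the diff context, not the PR's actual definitions:

```rust
// Illustrative test only: asserts the explicit ordering matches the
// intended lexicographic (height, then index) ordering.
#[test]
fn indexed_tx_ordering_is_lexicographic() {
    let early = IndexedTx { height: BlockHeight(1), index: TxIndex(5) };
    let mid = IndexedTx { height: BlockHeight(2), index: TxIndex(0) };
    let late = IndexedTx { height: BlockHeight(2), index: TxIndex(3) };

    // Block height dominates the comparison...
    assert!(early < mid);
    // ...and the intra-block tx index breaks ties.
    assert!(mid < late);
}
```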

@batconjurer (Member Author) commented Jan 24, 2024

> Thanks for these changes, I think I understand the general flow. But maybe I am missing some of the higher level considerations that motivated this approach to separating MASP synchronization into a separate command. Some questions:
>
> * Have alternatives to interrupts been considered? Like continuously saving/appending new transactions to a file?
> * Given interrupt usage and SyncStatus modality, will the MASP functionality be easily usable from the SDK on all platforms (including the web)?
> * How do the `ShieldedContext` lock guard in the SDK and `SyncStatus` marker type interact? Is there any overlap or redundancy here?
> * How do we handle shielded context file consistency when multiple clients (from separate processes) are running?
  • I considered having a daemon process that would do such a thing. I think it's more work and it will be more complicated to manage multiple processes reading and writing to the same shielded context. It is something I'd like eventually, but Adrian didn't consider it a priority.
  • This functionality will only work in the CLI. I will pull the syncing logic up into apps to reflect this. For the web, something bespoke will need to be written.
  • The SyncStatus is there so the syncing and non-syncing related functions are actually implemented on separate types (see the sketch after this list). This was especially helpful in making sure I had gotten rid of all the fetch calls in client code. This wasn't caught by the lock guards.
  • I hadn't considered this point because my impression was that the existing Namada trait implementations covered these problems. If that isn't the case, I will indeed need to do more work checking this. In particular, I won't allow more than one process access to a shielded context at a time. This can be relaxed later.
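A minimal sketch of this typestate idea, assuming marker types along these lines (the names and methods here are illustrative, not the PR's actual API):

```rust
use std::marker::PhantomData;

// Marker types for the two modes of the context.
struct Syncing;
struct Synced;

struct ShieldedContext<S> {
    // ... notes, viewing keys, cached txs, etc. ...
    _state: PhantomData<S>,
}

// Fetch/scan operations exist only on the syncing type.
impl ShieldedContext<Syncing> {
    fn fetch(&mut self) { /* pull and scan new MASP txs */ }

    // Finishing the sync consumes the context and changes its type, so
    // any leftover fetch call in client code becomes a compile error.
    fn finish(self) -> ShieldedContext<Synced> {
        ShieldedContext { _state: PhantomData }
    }
}

// Queries exist only on the synced type.
impl ShieldedContext<Synced> {
    fn balance(&self) -> u64 { /* read-only lookup */ 0 }
}
```

Splitting the API this way turns "you called fetch from query code" into a type error rather than a runtime surprise, which is the property described in the bullet above.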

@@ -187,7 +187,7 @@ fn run_ledger_ibc() -> Result<()> {
 }

 #[test]
-fn run_ledger_ibc_with_hermes() -> Result<()> {
+fn drun_ledger_ibc_with_hermes() -> Result<()> {
Contributor:

Small typo here

// Describe how a Transfer simply subtracts from one
// account and adds the same to another
tx_ctx.scan_tx(*indexed_tx, *epoch, tx, stx)?;
self.unscanned.txs.remove(indexed_tx);
Contributor:

This process of removing transactions starting from the oldest and going forwards in time could be interrupted, right? If this were to happen, then the very first accepted transactions would not be in self.unscanned.txs from the point that break is called and afterwards. Right? My question is: if the client were then to again add more new unknown keys to their wallet, would fetch start the scan from the very beginning? Or would it start the rescan at the transaction that was interrupted (despite the keys being new/unknown up till this point)?

Member Author:

This is part of the bug I mentioned above, and I have fixed it. If we have new keys, we need to scan from 0 to self.last_fetched. If some of those blocks are already in the local cache, we can skip fetching them.
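A self-contained sketch of this range logic; all the names here (`has_new_keys`, `latest_unscanned`, `last_fetched`, the cache set) are illustrative assumptions rather than the PR's actual fields:

```rust
use std::collections::BTreeSet;

// Hypothetical helper: decide which block heights still need fetching.
fn blocks_to_fetch(
    has_new_keys: bool,
    latest_unscanned: u64,  // newest height already scanned locally
    last_fetched: u64,      // newest height ever fetched from the chain
    cached: &BTreeSet<u64>, // heights already sitting in the local cache
) -> Vec<u64> {
    // New keys must see every historical note, so restart from genesis;
    // otherwise resume just past the newest block already scanned.
    let start = if has_new_keys { 0 } else { latest_unscanned + 1 };
    (start..=last_fetched)
        // Cached blocks only need re-scanning, not re-fetching.
        .filter(|h| !cached.contains(h))
        .collect()
}
```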

std::mem::swap(&mut self.unscanned.txs, &mut txs);
let txs = ProgressLogging::new(txs, io, ProgressType::Scan);
for (indexed_tx, (epoch, tx, stx)) in txs {
if self.interrupted() {
Contributor:

If an interrupt happens here, would self.unscanned.txs be left in a state where it's missing the very first transactions? If this were the case, would this hamper the client's future ability to scan the very first transactions with viewing keys that are not yet known at this point?

Member Author:

Yes, the cache is cleared out. If new keys appear later, we have to repopulate this cache.

last_query_height,
)
.await?;
self.unscanned.txs.extend(fetched);
Contributor:

Before this line, is self.unscanned.txs supposed to contain all transactions up till self.latest_unscanned()? After this line, is self.unscanned.txs supposed to contain all transactions up till last_query_height?

Member Author:

self.latest_unscanned() is supposed to tell us the most recent block height in the local cache. However, there is a bug here. We need to fetch from zero up to self.last_fetched since we are trying to sync up new keys with existing keys.

@murisi (Contributor) commented Jan 24, 2024

> I considered having a daemon process that would do such a thing. I think it's more work and it will be more complicated to manage multiple processes reading and writing to the same shielded context. It is something I'd like eventually, but Adrian didn't consider it a priority.

I was thinking that there could be alternatives to interrupts other than daemon processes. Like for instance, we could do things as before but atomically commit the ShieldedContext to file storage (using ShieldedUtils::save) whenever we reach consistent states (like a transaction fetched into the context, or a transaction scanned into the context). This way if Ctrl-C is pressed or there's an OS crash, the last saved state is still preserved. I say this because reasoning about interrupts, when they can occur, how to leave/save the context in a consistent state, how to maintain correctness as more functionality is added to the shielded pool, and how to integrate it into web clients and other contexts may be more challenging than a model without channels and receivers.
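A toy sketch of this checkpointing alternative (the types and the atomic-rename save are illustrative, not the SDK's actual `ShieldedUtils::save`):

```rust
use std::{fs, io, path::Path};

// Toy stand-in for the real shielded context.
struct ShieldedContext {
    scanned: Vec<u64>, // ids of transactions scanned so far
}

impl ShieldedContext {
    // Atomic save: write to a temp file, then rename over the old one,
    // so a crash mid-write never leaves a corrupt context on disk.
    fn save(&self, path: &Path) -> io::Result<()> {
        let tmp = path.with_extension("tmp");
        fs::write(&tmp, format!("{:?}", self.scanned))?;
        fs::rename(&tmp, path)
    }
}

fn scan_with_checkpoints(
    ctx: &mut ShieldedContext,
    txs: &[u64],
    path: &Path,
) -> io::Result<()> {
    for &tx in txs {
        ctx.scanned.push(tx); // stand-in for the real scan step
        // Commit after every consistent state: if Ctrl-C or an OS crash
        // hits, the last saved state survives and the sync just resumes.
        ctx.save(path)?;
    }
    Ok(())
}
```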

@murisi (Contributor) commented Jan 24, 2024

> The SyncStatus is there so the syncing and non-syncing related functions are actually implemented on separate types. This was especially helpful in making sure I had gotten rid of all the fetch calls in client code. This wasn't caught by the lock guards.

I see, this is a valid approach. That being said, though it may be undesirable to do a sync just before displaying balances in our client, I wouldn't say that it is necessarily illogical (to the point of being a type error) for users of the SDK to interleave syncing and non-syncing operations in their own applications.

@murisi (Contributor) commented Jan 29, 2024

When working with you on the note scanning algorithm, I noticed two things:

  1. The SyncStatus caused friction in our SDK usage when we were trying to save the ShieldedContext. We eventually worked around this issue by cloning the ShieldedContext before saving it (which should not be necessary, since Borsh only requires a ShieldedContext reference to do serialization; see the sketch after this list). We also had to introduce a PhantomData<SyncStatus> since the SDK code does not actually depend on the sync status.
  2. We have essentially moved the interruption logic out of the SDK and into the apps crate. This means that the SDK no longer really deals with partially completed synchronizations in its interface.
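For item 1, a small illustration of the Borsh point (the struct is a toy; the claim being shown is only that `BorshSerialize::serialize` takes `&self`):

```rust
use borsh::BorshSerialize;

#[derive(BorshSerialize)]
struct ShieldedContext {
    last_fetched: u64,
    // ... notes, keys, etc.
}

// Serialization takes only a shared reference, so saving a context
// should not require cloning it first.
fn save(ctx: &ShieldedContext) -> std::io::Result<Vec<u8>> {
    let mut buf = Vec::new();
    ctx.serialize(&mut buf)?;
    Ok(buf)
}
```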

In light of the above, it may be worth reconsidering if and how we do SyncStatus. If we are doing it for aesthetic reasons, then it should be fine. However if we are doing it to enforce correctness, then we should establish (in the absence of interrupts) what sequence of save, fetch, load, and gen_shielded_transfer operations will corrupt the internal state of the ShieldedContext (and render it incorrect/unusable) and therefore require the overhead of the type system to prevent. My general fear is that we may be ossifying the apps crate's latest ShieldedContext usage pattern (synchronization subcommand completely separated from other MASP actions) into the SDK unnecessarily, thereby creating friction for other SDK use cases (for instance a non-interactive program that repeatedly fetches and does useful shielded work).

If the SyncStatus must remain, then we should also consider moving it into the apps crate where such information may be more relevant. Or alternatively, we could move these SyncStatus API changes into a separate PR since this partitioning of ShieldedContext (rather than the creation of a separate syncing command) seems to form the bulk of this PR's diff. Doing this could make reviewing the sync subcommand changes easier. That being said, I'm happy to review whichever approach you decide upon.

@brentstone (Collaborator):
Closing since this has already been done.

@brentstone closed this Apr 5, 2024